61 research outputs found

    Tackling scalability issues in mining path patterns from knowledge graphs: a preliminary study

    Features mined from knowledge graphs are widely used within multiple knowledge discovery tasks such as classification or fact-checking. Here, we consider a given set of vertices, called seed vertices, and focus on mining their associated neighboring vertices, paths, and, more generally, path patterns that involve classes of ontologies linked with knowledge graphs. Due to the combinatorial nature and the increasing size of real-world knowledge graphs, the task of mining these patterns immediately entails scalability issues. In this paper, we address these issues by proposing a pattern mining approach that relies on a set of constraints (e.g., support or degree thresholds) and the monotonicity property. As our motivation comes from the mining of real-world knowledge graphs, we illustrate our approach with PGxLOD, a biomedical knowledge graph
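The constraint-based idea in this abstract (support and degree thresholds combined with monotonicity) can be illustrated with a minimal sketch. This is not the authors' algorithm: the function name `mine_path_patterns`, the toy graph, the type labels, and the choice of out-degree as the degree constraint are all assumptions made for illustration, and no PGxLOD data is reproduced. A pattern here is a sequence of (predicate, class of reached vertex) steps, and its support is the number of seed vertices with at least one matching path.

```python
from collections import defaultdict


def mine_path_patterns(graph, types, seeds, min_support, max_degree, max_length):
    """Level-wise mining of path patterns starting from seed vertices.

    graph: dict vertex -> list of (predicate, neighbor) edges (hypothetical format)
    types: dict vertex -> ontology class label (hypothetical)
    Returns dict pattern -> support (number of seeds with a matching path).
    """
    # frontier maps each kept pattern to {seed: vertices reached via that pattern}
    frontier = {(): {s: {s} for s in seeds}}
    results = {}
    for _ in range(max_length):
        extended = defaultdict(lambda: defaultdict(set))
        for pattern, reached in frontier.items():
            for seed, ends in reached.items():
                for v in ends:
                    edges = graph.get(v, [])
                    if len(edges) > max_degree:
                        continue  # degree constraint: do not expand hub vertices
                    for predicate, w in edges:
                        step = (predicate, types.get(w, "Thing"))
                        extended[pattern + (step,)][seed].add(w)
        # Monotonicity: an extension never has more support than its prefix,
        # so patterns below min_support are dropped and never extended.
        frontier = {p: dict(r) for p, r in extended.items() if len(r) >= min_support}
        results.update({p: len(r) for p, r in frontier.items()})
    return results


# Toy graph with made-up identifiers
graph = {
    "drugA": [("targets", "gene1")],
    "drugB": [("targets", "gene2")],
    "drugC": [("treats", "disease1")],
    "gene1": [("involvedIn", "pathway1")],
    "gene2": [("involvedIn", "pathway1")],
}
types = {"gene1": "Gene", "gene2": "Gene", "disease1": "Disease", "pathway1": "Pathway"}
patterns = mine_path_patterns(
    graph, types, seeds=["drugA", "drugB", "drugC"],
    min_support=2, max_degree=10, max_length=2,
)
```

In this toy run the pattern ending in a `Disease` is matched by only one seed, so it is pruned at length 1 and none of its extensions are ever generated, which is where the scalability gain comes from.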

    Formal Concept Analysis for the Interpretation of Relational Learning applied on 3D Protein-Binding Sites

    Inductive Logic Programming (ILP) is a powerful learning method which allows an expressive representation of the data and produces explicit knowledge. However, ILP systems suffer from a major drawback: they return a single theory based on heuristic user choices of various parameters, thus ignoring potentially relevant rules. Accordingly, we propose an original approach based on Formal Concept Analysis (FCA) for effective interpretation of the resulting theories, with the possibility of adding domain knowledge. Our approach is applied to the characterization of three-dimensional (3D) protein-binding sites, which are the protein portions on which interactions with other proteins take place. In this context, we define a relational and logical representation of 3D patches and formalize the problem as a concept learning problem using ILP. We report here the results we obtained on a particular category of protein-binding sites, namely phosphorylation sites, using ILP followed by FCA-based interpretation.

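    Discovering associations between adverse drug events using pattern structures and ontologies (original title: Découverte d'associations entre Evénements Indésirables Médicamenteux par les structures de patrons et les ontologies)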

    This article presents a method for extracting associations between adverse drug events (ADEs) using pattern structures. The method enables a comparison of ADEs based on biomedical ontologies and has previously been presented with an application to electronic health records. An application to another type of data, using other biomedical ontologies, is presented here, together with a more detailed comparison and evaluation of the extracted rules. The method proves flexible, since it can be applied to various datasets and ontologies, and it is able to extract association rules between ADEs with an expressive representation of these ADEs.

    Integrative relational machine-learning for understanding drug side-effect profiles

    Background: Drug side effects represent a common reason for stopping drug development during clinical trials. Improving our ability to understand drug side effects is necessary to reduce attrition rates during drug development as well as the risk of discovering novel side effects in available drugs. Today, most investigations deal with isolated side effects and overlook possible redundancy and their frequent co-occurrence. Results: In this work, drug annotations are collected from the SIDER and DrugBank databases. Terms describing individual side effects reported in SIDER are clustered with a semantic similarity measure into term clusters (TCs). Maximal frequent itemsets are extracted from the resulting drug × TC binary table, leading to the identification of what we call side-effect profiles (SEPs). A SEP is defined as the longest combination of TCs which are shared by a significant number of drugs. Frequent SEPs are explored on the basis of integrated drug and target descriptors using two machine learning methods: decision trees and inductive logic programming. Although both methods yield explicit models, the inductive logic programming method performs relational learning and is able to exploit not only drug properties but also background knowledge. Learning efficiency is evaluated by cross-validation and direct testing with new molecules. Comparison of the two machine learning methods shows that the inductive logic programming method displays a greater sensitivity than decision trees and successfully exploits background knowledge such as functional annotations and pathways of drug targets, thereby producing rich and expressive rules. All models and theories are available on a dedicated web site. Conclusions: Side-effect profiles covering a significant number of drugs have been extracted from a drug × side-effect association table. Integration of background knowledge concerning both chemical and biological spaces has been combined with a relational learning method for discovering rules which explicitly characterize drug-SEP associations. These rules are successfully used for predicting SEPs associated with new drugs.
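The extraction of maximal frequent itemsets from a binary drug × TC table, as described in this abstract, can be sketched as follows. This is a minimal Apriori-style illustration under stated assumptions, not the pipeline used in the paper: the function names, the toy table, and the support threshold are hypothetical, and no SIDER or DrugBank data is reproduced.

```python
from itertools import combinations


def frequent_itemsets(table, min_support):
    """table: dict drug -> set of term clusters (TCs).

    Returns dict frozenset(TCs) -> number of drugs sharing all of them.
    """
    def support(itemset):
        return sum(1 for tcs in table.values() if itemset <= tcs)

    items = sorted({tc for tcs in table.values() for tc in tcs})
    level = [frozenset([tc]) for tc in items]
    frequent = {}
    while level:
        kept = []
        for candidate in level:
            count = support(candidate)
            if count >= min_support:
                frequent[candidate] = count
                kept.append(candidate)
        # Join frequent k-itemsets into (k+1)-candidates
        k = len(kept[0]) + 1 if kept else 0
        candidates = {a | b for a, b in combinations(kept, 2) if len(a | b) == k}
        # Apriori pruning: every k-subset of a candidate must itself be frequent
        level = [
            c for c in candidates
            if all(frozenset(sub) in frequent for sub in combinations(c, k - 1))
        ]
    return frequent


def maximal_itemsets(frequent):
    """Keep only itemsets that have no frequent proper superset."""
    return {s: n for s, n in frequent.items() if not any(s < t for t in frequent)}


# Toy drug x TC table with made-up identifiers
table = {
    "drug_a": {"TC1", "TC2", "TC3"},
    "drug_b": {"TC1", "TC2", "TC3"},
    "drug_c": {"TC1", "TC2"},
    "drug_d": {"TC4"},
}
seps = maximal_itemsets(frequent_itemsets(table, min_support=2))
```

With this toy table and a minimum support of 2, the only maximal frequent itemset is {TC1, TC2, TC3}, shared by two drugs, which plays the role of a side-effect profile in the sense defined above.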

    Investigating ADR mechanisms with Explainable AI: a feasibility study with knowledge graph mining

    Adverse drug reactions (ADRs) are statistically characterized within randomized clinical trials or by postmarketing pharmacovigilance. However, the molecular mechanisms causing ADRs remain unknown in most cases. This is true even for common toxicities that are classically monitored during trials, such as hepatic or skin toxicities. Interestingly, many elements of knowledge about drugs and drug ingredients are available beyond clinical trials. In particular, open-access knowledge graphs describe their properties, interactions, and involvement in pathways. Classifications have also been manually established by experts and label drugs as either causative or not for several types of ADRs. In our paper, we propose to mine biomedical knowledge graphs to identify biomolecular features that make it possible to automatically reproduce such expert classifications, distinguishing drugs that are causative from those that are not for a given type of ADR. From an Explainable AI perspective, we explore simple classification techniques such as Decision Trees and Classification Rules because they provide human-readable models which explain the classification itself. We also evaluate the assumption that biomolecular features mined from knowledge graphs might provide elements of explanation for the molecular mechanisms behind ADRs.

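    Exposure to low doses of alkylphenols causes mammary epithelium alterations and transgenerational defects but does not increase the tumorigenic potential of mammary cancer cells (original title: Une exposition à de faibles doses d'alkylphénols entraine des altérations de épithélium mammaires et des défauts transgénérationnels mais n'augmente pas le potentiel tumorigénique des cellules cancéreuses mammaires)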

    Fetal and neonatal exposure to long chain alkylphenols has been suspected to promote breast developmental disorders and consequently to increase breast cancer risk. However, disease predisposition from developmental exposures remains unclear. In this work, human MCF-10A mammary epithelial cells were exposed in vitro to a low dose of a realistic [4-nonylphenol+4-tert-octylphenol] mixture. Transcriptome and cell phenotype analyses combined with functional and signaling network modeling indicated that long chain alkylphenols triggered enhanced proliferation, migration ability and apoptosis resistance, and shed light on the underlying molecular mechanisms, which involved the human estrogen receptor variant ERα36. A transgenerational mouse model with male inheritance, using exposure to 3 environmentally relevant doses of the alkylphenol mix, was set up in order to determine whether and how such exposure would affect mammary gland architecture. Mammary glands from F3 progeny obtained after intrabuccal chronic exposure of C57BL/6J P0 pregnant mice followed by F1 to F3 male inheritance displayed an altered histology which correlated with the phenotypes observed in vitro in human mammary epithelial cells. Since cellular phenotypes are similar in vivo and in vitro and involve the unique ERα36 human variant, such consequences of alkylphenol exposure could be extrapolated from the mouse model to humans. However, transient alkylphenol treatment combined with ERα36 overexpression in mammary epithelial cells was not sufficient to trigger tumorigenesis in xenografted Nude mice. Therefore, it remains to be determined whether low dose alkylphenol transgenerational exposure and subsequent abnormal mammary gland development could account for an increased breast cancer susceptibility.